Skip to content

unused_code: handle git grep no-matches and untracked #203

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 19 commits into
base: main
Choose a base branch
from

Conversation

myakove
Copy link
Collaborator

@myakove myakove commented Aug 9, 2025

detect pytest fixtures incl. param usage; fail CI on unexpected git errors; aggregate all unused functions per file; silence coverage show_contexts warning; fix jira_utils config precedence for tests

fix 186

Summary by CodeRabbit

  • New Features

    • Unused-code tool: analyze a single file or a directory; smarter detection (pytest fixtures, doc noise filtering); auto-detects grep capabilities; deterministic, sorted results; enhanced CLI options.
  • Documentation

    • Expanded README with usage examples, options, config samples, and skip annotations.
  • Tests

    • Extensive new tests and manifests covering fixtures, grep behavior, path handling, CLI flows, and edge cases.
  • Chores

    • .gitignore updated with AI-related entries; coverage HTML hides contexts.
  • Style

    • Minor whitespace cleanup.

myakove added 3 commits August 9, 2025 13:39
…fixtures incl. param usage; fail CI on unexpected git errors; aggregate all unused functions per file; silence coverage show_contexts warning; fix jira_utils config precedence for tests
…h subprocess.CompletedProcess and avoid AttributeError in _git_grep
Copy link
Contributor

coderabbitai bot commented Aug 9, 2025

Walkthrough

Enhances the unused-code analyzer with git-grep flag probing, pytest-fixture-aware usage detection, portable regex builders, and CLI options for single-file or directory analysis; updates file discovery, adds tests/manifests and README, tweaks coverage config and .gitignore, and applies a whitespace change in Jira utils.

Changes

Cohort / File(s) Summary
Unused code analyzer core
apps/unused_code/unused_code.py
Adds cached grep-flag detection, pytest fixture detection, portable usage/fixture regex builders, git-grep helpers, documentation-line filtering, expanded fixture-usage checks, refactors process_file to accumulate messages, and enhances get_unused_functions for single-file/dir analysis, sorting, and per-future error handling.
Unused code tests & manifests
tests/unused_code/test_unused_code.py, tests/unused_code/test_unused_code_coverage.py, tests/unused_code/unused_code_file_for_test.py, tests/unused_code/manifests/*, tests/unused_code/manifests/test_config.yaml
Adds extensive unit tests for fixture detection, git-grep error handling, path parsing, skip markers, CLI behavior, and coverage; introduces manifests and config YAML used by tests.
Unused code docs
apps/unused_code/README.md
Adds usage guide, CLI examples/options, config-file example, and skip-annotation guidance.
Utils helper
apps/utils.py
Changes all_python_files signature to accept optional directory, walks that target, and yields only .py file paths.
Jira utils
apps/jira_utils/jira_information.py
Adds a single blank line after url/token assignments in process_jira_command_line_config_file; no logic change.
Repo config
pyproject.toml, .gitignore
Sets coverage HTML show_contexts to false and appends AI-related ignore patterns to .gitignore.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

verified, size/M, commented-myakove

Suggested reviewers

  • rnetser
  • dbasunag

Tip

🔌 Remote MCP (Model Context Protocol) integration is now available!

Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats.


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 01995c0 and 6e62487.

📒 Files selected for processing (2)
  • apps/unused_code/unused_code.py (7 hunks)
  • tests/unused_code/test_unused_code_coverage.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/unused_code/unused_code.py
  • tests/unused_code/test_unused_code_coverage.py
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/unusedcode-handle-git-grep-no-matches-and-untracked

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@redhat-qe-bot1
Copy link
Collaborator

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: A tracking issue is created for this PR and will be closed when the PR is merged or closed
  • Pre-commit Checks: pre-commit runs automatically if .pre-commit-config.yaml exists
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest python-module-install - Test Python package installation
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 0 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No WIP, hold, or conflict labels
  5. Verified: PR must be marked as verified (if verification is enabled)

📊 Review Process

Approvers and Reviewers

Approvers:

  • dbasunag
  • myakove
  • rnetser

Reviewers:

  • dbasunag
  • myakove
  • rnetser
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
  • automerge

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is automatically removed on each new commit
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (6)
apps/jira_utils/jira_information.py (1)

153-160: Config-file precedence: consider falling back to CLI/env if keys are missing in the file

Current logic overwrites url/token with empty strings when a config file path is provided but the file lacks those keys, causing an immediate exit. If the intent is strict reproducibility, this is fine. If not, prefer config values when present, otherwise fall back to CLI/env to avoid surprising failures.

Proposed tweak:

-    if config_file_path:
-        url = config_dict.get("url", "")
-        token = config_dict.get("token", "")
+    if config_file_path:
+        # Prefer config values when provided, otherwise fall back to CLI/env
+        url = config_dict.get("url") or url
+        token = config_dict.get("token") or token

Please confirm which behavior you want long-term (strict vs. fallback).

tests/unused_code/test_unused_code.py (2)

47-76: Good coverage for fixture-used-as-parameter; consider adding a test for “commented usage”

This test validates the parameter-usage path for fixtures. To harden behavior, add a test ensuring a function name appearing only in a commented line isn’t treated as “used.” This will catch regressions around comment filtering.

I can draft a test that writes a file with a commented-out call and asserts the function is still reported unused.


117-126: Improve fidelity of the subprocess mock

_git_grep uses text=True, so stdout/stderr are str in reality. Returning bytes works here but diverges from actual types. Prefer strings or use subprocess.CompletedProcess for realism.

 class FakeCompleted:
     def __init__(self):
         self.returncode = 2
-        self.stdout = b""
-        self.stderr = b"fatal: not a git repository"
+        self.stdout = ""
+        self.stderr = "fatal: not a git repository"
apps/unused_code/unused_code.py (3)

34-60: Pytest fixture detection: robust and readable

Decorator handling covers both @pytest.fixture and from pytest import fixture forms, with and without parentheses. Looks good. Optional: support aliased imports (e.g., from pytest import fixture as fx) if needed in your codebase.


62-88: _git_grep: correct handling for rc=0/1 and unexpected errors

PCRE and untracked coverage are nice touches. Optional: add -I to ignore binary files for safety in mixed repos.


202-224: Exit code on processing failure: confirm desired semantics

Exceptions during file processing exit with code 2. A prior team preference (retrieved learning) was to exit(1) immediately on file read/parse failures. Do we want to standardize on 1 for any failure, or keep 1 = “unused found” vs 2 = “processing error”?

If you prefer exit(1) for these failures, I can provide a patch; otherwise, please document the exit code semantics in the CLI help for clarity.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46ff2f7 and 6fe4908.

📒 Files selected for processing (5)
  • apps/jira_utils/jira_information.py (1 hunks)
  • apps/unused_code/unused_code.py (4 hunks)
  • pyproject.toml (1 hunks)
  • tests/unused_code/test_unused_code.py (2 hunks)
  • tests/unused_code/unused_code_file_for_test.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2024-12-29T12:05:41.200Z
Learnt from: myakove
PR: RedHatQE/python-utility-scripts#135
File: apps/unused_code/unused_code.py:61-62
Timestamp: 2024-12-29T12:05:41.200Z
Learning: When file reading or parsing fails in apps/unused_code/unused_code.py, we want to fail (use sys.exit(1)) immediately instead of continuing.

Applied to files:

  • tests/unused_code/test_unused_code.py
  • tests/unused_code/unused_code_file_for_test.py
📚 Learning: 2024-12-29T11:59:22.364Z
Learnt from: myakove
PR: RedHatQE/python-utility-scripts#135
File: apps/unused_code/unused_code.py:0-0
Timestamp: 2024-12-29T11:59:22.364Z
Learning: When a function name retrieved from the AST is not found by 'git grep', the user wants a RuntimeError raised, instead of silently returning or skipping.

Applied to files:

  • apps/unused_code/unused_code.py
🧬 Code Graph Analysis (2)
tests/unused_code/test_unused_code.py (1)
apps/unused_code/unused_code.py (3)
  • _git_grep (62-87)
  • get_unused_functions (192-227)
  • process_file (112-171)
apps/unused_code/unused_code.py (1)
apps/utils.py (1)
  • all_python_files (76-86)
🔇 Additional comments (5)
tests/unused_code/unused_code_file_for_test.py (1)

1-7: LGTM: clearer stubs with explicit return types

Switching to ellipsis stubs with None annotations is concise and keeps intent clear.

pyproject.toml (1)

48-48: Coverage HTML contexts disabled

Acknowledged. Disabling show_contexts simplifies the report. If any tooling depends on contexts, ensure it’s updated.

tests/unused_code/test_unused_code.py (3)

1-3: LGTM: imports updated for new tests and helpers

The added imports align with the new fixture and grep behavior tests.

Also applies to: 6-6


78-96: LGTM: no-match handling verified

Solid check that an empty grep result returns the “unused” message without crashing.


98-115: LGTM: autouse fixture skip verified

Accurately ensures autouse fixtures are not flagged as unused.

myakove added 3 commits August 9, 2025 14:18
…ate line content; ensure commented-line checks work reliably
…dout/stderr; fix formatting\nfix(unused_code): add -I to git grep to ignore binaries
…env for url/token to avoid unintended failures
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
apps/unused_code/unused_code.py (2)

142-143: split(":", 2) still breaks on Windows paths

The earlier review flagged that git grep output should be parsed with rsplit(":", 2) to avoid mangling drive-letter paths like C:/repo/file.py:12:.... Using split from the left produces wrong _path and _line segments, so comment filtering and “own definition” checks fail again.

-    _path, _lineno, _line = entry.split(":", 2)
+    _path, _lineno, _line = entry.rsplit(":", 2)

160-163: Same parsing issue in fixture-parameter search

Repeat the rsplit(":", 2) fix here to keep the logic consistent and correct.

🧹 Nitpick comments (2)
apps/unused_code/unused_code.py (1)

207-229: Thread-pool failure handling exits entire run

sys.exit(2) inside the for future in as_completed loop aborts on the first file that raises, hiding subsequent errors. Collect all failures and print a summary instead so users can fix multiple issues in one run.

tests/unused_code/test_unused_code.py (1)

65-71: Use path.as_posix() to avoid OS-specific path separators in mock

Building grep output with f"{py_file}:1:..." relies on the OS path format. For cross-platform safety (especially Windows), convert the Path to POSIX first: py_file.as_posix().

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6fe4908 and 261388e.

📒 Files selected for processing (3)
  • apps/jira_utils/jira_information.py (1 hunks)
  • apps/unused_code/unused_code.py (4 hunks)
  • tests/unused_code/test_unused_code.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • apps/jira_utils/jira_information.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-12-29T11:59:22.364Z
Learnt from: myakove
PR: RedHatQE/python-utility-scripts#135
File: apps/unused_code/unused_code.py:0-0
Timestamp: 2024-12-29T11:59:22.364Z
Learning: When a function name retrieved from the AST is not found by 'git grep', the user wants a RuntimeError raised, instead of silently returning or skipping.

Applied to files:

  • apps/unused_code/unused_code.py
🧬 Code Graph Analysis (1)
apps/unused_code/unused_code.py (1)
apps/utils.py (1)
  • all_python_files (76-86)

@myakove
Copy link
Collaborator Author

myakove commented Aug 9, 2025

/verified

myakove added 3 commits August 9, 2025 14:34
…ce and cache; use detected flag for portability
…table patterns for -P/-G; keep comment filtering correct
…first; show summary\ntests(unused_code): use posix path in mock grep output and align with new call pattern
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
apps/unused_code/unused_code.py (1)

203-208: Fix parsing of git-grep output; current split breaks on Windows drive letters and paths with colons

Use rsplit to reliably extract the last two fields (lineno and content). This also fixes comment filtering on the content.

-            _path, _lineno, _line = entry.split(":", 2)
+            _path, _lineno, _line = entry.rsplit(":", 2)
@@
-                _path, _lineno, _line = entry.split(":", 2)
+                _path, _lineno, _line = entry.rsplit(":", 2)

If you want, I can add a unit test asserting commented-out usages are ignored for both call-sites and fixture-parameter searches.

Also applies to: 221-228

🧹 Nitpick comments (1)
apps/unused_code/unused_code.py (1)

272-299: Small robustness improvements for concurrency and UX

  • Pre-flight grep-flag detection before spawning threads to fail fast with a clear error if unsupported.
  • Optionally stabilize output ordering for deterministic CI logs.
     if not os.path.exists(".git"):
         LOGGER.error("Must be run from a git repository")
         sys.exit(1)
 
+    # Pre-flight: detect supported grep engine once and fail-fast if unsupported
+    try:
+        _detect_supported_grep_flag()
+    except RuntimeError as e:
+        LOGGER.error(str(e))
+        sys.exit(2)
+
     with ThreadPoolExecutor() as executor:

Optional deterministic output (outside this hunk):

  • Before printing, split lines and sort:
if unused_functions:
    lines = "\n".join(unused_functions).splitlines()
    click.echo("\n".join(sorted(lines)))
    sys.exit(1)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 261388e and 9507035.

📒 Files selected for processing (2)
  • apps/unused_code/unused_code.py (6 hunks)
  • tests/unused_code/test_unused_code.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unused_code/test_unused_code.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-12-29T11:59:22.364Z
Learnt from: myakove
PR: RedHatQE/python-utility-scripts#135
File: apps/unused_code/unused_code.py:0-0
Timestamp: 2024-12-29T11:59:22.364Z
Learning: When a function name retrieved from the AST is not found by 'git grep', the user wants a RuntimeError raised, instead of silently returning or skipping.

Applied to files:

  • apps/unused_code/unused_code.py
🧬 Code Graph Analysis (1)
apps/unused_code/unused_code.py (1)
apps/utils.py (1)
  • all_python_files (76-86)
🔇 Additional comments (2)
apps/unused_code/unused_code.py (2)

75-101: Fixture detection logic looks solid

Covers both @pytest.fixture and from pytest import fixture with/without parentheses. Reasonable tradeoff even without alias-resolution.


146-151: Behavior change: rc=1 returns [] (no matches)

This aligns with the PR objective “handle git grep no-matches.” Note: a prior preference (from learnings) requested raising on unfound function names. Please confirm this new behavior supersedes the old expectation.

I’ll proceed assuming the new behavior is intended.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
apps/unused_code/unused_code.py (2)

264-273: Bug: incorrect parsing of git grep output; use rsplit to handle Windows paths with colons

split(":", 2) from the left breaks when paths contain colons (e.g., “C:...”); _line ends up prefixed with the line number, breaking comment filtering and other checks.

-            parts = entry.split(":", 2)
+            # git grep -n outputs: path:lineno:content; split from the right to handle colons in paths
+            parts = entry.rsplit(":", 2)
             if len(parts) != 3:
                 continue
-            _path, _lineno, _line = parts
+            _path, _lineno, _line = parts

Add a regression test that ensures commented-out usages (“# foo()”) are ignored even when line numbers are present.


295-306: Same parsing bug in fixture-parameter loop; use rsplit here as well

Mirrors the call-site parsing issue. Fix as above.

-                parts = entry.split(":", 2)
+                parts = entry.rsplit(":", 2)
                 if len(parts) != 3:
                     continue
                 _path, _lineno, _line = parts
🧹 Nitpick comments (7)
tests/unused_code/test_doc_header_false_positives.py (4)

27-60: Test name/docstring vs assertion are contradictory

Docstrings say “should not match,” yet assertions intentionally expect a match to demonstrate the current false positive. Clarify intent in the test name/docstring to avoid confusion for future readers.

-    def test_documentation_patterns_should_not_match_pcre(self):
-        """Test that PCRE patterns don't match documentation patterns."""
+    def test_pcre_call_pattern_currently_matches_doc_lines(self):
+        """Demonstrate that the current PCRE call pattern matches documentation lines (known false positive)."""

181-220: Integration test is solid; add a case for commented-out usages

Good coverage of doc-only references. Add a case where the only “usage” is commented out (e.g., “# create_namespace()”) to assert it’s ignored, preventing regressions in comment filtering.

I can add a param to documentation_matches to include a commented usage and assert it’s ignored.


231-269: Assertion condition is weak; tighten to ensure the function isn’t reported unused

The current OR condition passes if either substring is absent anywhere in the output. Make the intent explicit.

-        assert "get_pod_status" not in result or "Is not used anywhere in the code" not in result
+        assert "get_pod_status:Is not used anywhere in the code" not in result.replace("\n", "")
# Alternatively:
+# assert "get_pod_status" not in result  # if you want a stricter check

347-349: Avoid test runner invocation inside test module

Running pytest from within the test file can interfere with external runners and parallelization. Recommend removing this block.

-if __name__ == "__main__":
-    # Run tests to demonstrate the issue
-    pytest.main([__file__, "-v"])
apps/unused_code/unused_code.py (3)

126-185: Documentation filter helps significantly; consider a few minor enhancements

Current heuristics are sensible. Optional improvements:

  • Add “except ” and “try:” to code_prefixes to avoid misclassifying exceptional control flow.
  • Doc-starters list treats any line starting with “” or “-” as doc only if it contains “name(”; acceptable, but you may further require a space after the bullet to reduce rare code edge cases (e.g., “x-foo(…)”).
-    code_prefixes = ["if ", "elif ", "while ", "for ", "with ", "assert ", "return ", "yield ", "raise "]
+    code_prefixes = ["if ", "elif ", "while ", "for ", "with ", "assert ", "return ", "yield ", "raise ", "try:", "except "]

275-286: Heuristics to skip def/comment/docs are correct; consider short-circuiting to used=True

After passing all filters, you can set used = True without re-checking name containment since grep matched the call pattern already.

-            if func.name in _line:
-                used = True
-                break
+            used = True
+            break

382-386: Aggregating per-file outputs is fine; consider splitting lines before sorting for global ordering

If you ever want fully deterministic ordering across all findings (not just per-file blocks), split lines and sort globally.

-        sorted_output = sorted(unused_functions)
-        click.echo("\n".join(sorted_output))
+        all_lines: list[str] = []
+        for block in unused_functions:
+            all_lines.extend(block.splitlines())
+        click.echo("\n".join(sorted(all_lines)))
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46adfac and ec79dc8.

📒 Files selected for processing (2)
  • apps/unused_code/unused_code.py (6 hunks)
  • tests/unused_code/test_doc_header_false_positives.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/unused_code/test_doc_header_false_positives.py (1)
apps/unused_code/unused_code.py (3)
  • _build_call_pattern (100-113)
  • _is_documentation_pattern (126-184)
  • process_file (239-315)
apps/unused_code/unused_code.py (1)
apps/utils.py (1)
  • all_python_files (76-86)
🔇 Additional comments (5)
apps/unused_code/unused_code.py (5)

22-55: Portable, cached grep-flag detection LGTM

Thread-safe LRU caching, output discard, and clean error handling look good.


100-114: Call-site pattern builder is appropriate and portable

PCRE and basic-regex variants with word boundaries and optional whitespace are correct for reducing false negatives.


116-124: Fixture parameter pattern covers both engines well

Good use of \b for -P and <…> for -G, plus flexible spacing.


187-215: git grep wrapper improvements LGTM

  • Dynamic engine, -e for safety, deterministic handling of rc=0/1 vs errors: good.

346-381: Good: fail-fast grep probe and robust per-file exception handling

Pre-flight flag detection + per-future error aggregation with a distinct exit code improves CI signal. Nice.

return True

# Pattern 3: Lines that contain common documentation keywords near the function name
doc_keywords = ["Args:", "Arguments:", "Parameters:", "Returns:", "Return:", "Raises:", "Note:", "Example:"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are typically not in the same line as the function names. Doc patterns typically includes one of these string and next line may have an arg name. So I am not sure if we will get to these lines.


# Pattern 4: Check for common docstring patterns
# Lines that start with common documentation patterns
doc_starters = ['"""', "'''", "# ", "## ", "### ", "*", "-", "•"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I am calling a member function, that would include . e.g. .wait_for_status(). I may have a resource class in my repo.

r"""Build a portable regex to find a parameter named `function_name` in a def signature."""
flag = _detect_supported_grep_flag()
if flag == "-P":
return rf"def\s+\w+\s*[(][^)]*\b{function_name}\b"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would find a fixture if it appears in the first line of the test or fixture definition. How about those who appears in a different line? e.g.

 @pytest.mark.usefixtures("sa_namespace", "service_account", "mr_access_role", "mr_access_role_binding")
    def test_service_account_access_granted(
        self: Self,
        sa_token: str,
        model_registry_instance_rest_endpoint: list[str],
    ):

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@myakove this is still valid. I think you need a test with a fixture that is like this

@pytest.fixture(scope="function")
def fixtureA():
    <does something>

def test_temp(
fixtureA
):
    <does something>

I did not see it getting caught.

Also how about fixtures being used via usefixtures()? I don't see (I might be missing something), that being handled either.
I did install your branch locally and tested it and both cases reported failures.

if func.name in _line:
used = True
break

# If not found and it's a pytest fixture, also search for parameter usage in function definitions
if not used and is_pytest_fixture(func=func):
param_pattern = _build_fixture_param_pattern(function_name=func.name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also look for fixtures in usefixtures()

…o fix/unusedcode-handle-git-grep-no-matches-and-untracked
@myakove
Copy link
Collaborator Author

myakove commented Aug 13, 2025

@dbasunag Moved all function to test into unused_code_file_for_test.py for better visibility, check it and let me know if your concerns are still valid.

- Rename all functions in unused_code_file_for_test.py to start with "unused_code_" prefix
- Update test references in test_doc_header_false_positives.py to use new function names
- Update documentation patterns and function calls to match new naming convention
- Ensures compliance with requirement that test functions must start with "unused_code_"
- Fix unused variable warnings in test files
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
apps/unused_code/unused_code.py (2)

269-272: Critical: Incorrect git grep output parsing causes comment detection to fail.

The current parsing splits the output with split(":", 2), but this extracts path, line number, and content. The comment detection on Line 279 then checks if _line.strip().startswith("#"), but _line contains the actual line content after the second colon, so this check works correctly.

However, there's a subtle issue: the code should consistently handle the parsing. The variable naming is misleading - you have _, _, _line = parts which suggests you're discarding the first two parts, but actually you need all three parts for proper validation.

Apply this fix for clarity:

-            parts = entry.split(":", 2)
-            if len(parts) != 3:
-                continue
-            _, _, _line = parts
+            parts = entry.split(":", 2)
+            if len(parts) != 3:
+                continue
+            _path, _lineno, _line = parts

298-301: Same parsing clarity issue in fixture parameter search.

The fixture parameter search has the same misleading variable naming, though the parsing itself is correct.

The code already correctly names the variables on Line 301, so this is actually fine. The variables are properly named as _path, _lineno, _line = parts.

🧹 Nitpick comments (2)
apps/unused_code/unused_code.py (2)

157-158: Documentation pattern detection may have false negatives.

The regex pattern rf"^\s+{re.escape(function_name)}\s*\([^)]*\)\s*:\s+\w" requires indentation at the start (^\s+). This might miss documentation patterns that start at the beginning of a line without indentation.

Consider making the indentation optional:

-    if re.search(rf"^\s+{re.escape(function_name)}\s*\([^)]*\)\s*:\s+\w", stripped_line):
+    if re.search(rf"^\s*{re.escape(function_name)}\s*\([^)]*\)\s*:\s+\w", stripped_line):

359-365: Remove commented-out debug code before merging.

There's commented-out debug code that appears to be for testing purposes. This should be removed to keep the codebase clean.

-    # res = process_file(
-    #     py_file="tests/unused_code/unused_code_file_for_test.py",
-    #     func_ignore_prefix=func_ignore_prefix,
-    #     file_ignore_list=file_ignore_list,
-    # )
-    # __import__("ipdb").set_trace()
-    # return
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ec79dc8 and f96b127.

📒 Files selected for processing (4)
  • apps/unused_code/unused_code.py (6 hunks)
  • tests/unused_code/test_doc_header_false_positives.py (1 hunks)
  • tests/unused_code/test_unused_code.py (3 hunks)
  • tests/unused_code/unused_code_file_for_test.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
tests/unused_code/test_doc_header_false_positives.py (1)
apps/unused_code/unused_code.py (3)
  • _build_call_pattern (100-113)
  • _is_documentation_pattern (126-184)
  • process_file (239-315)
tests/unused_code/test_unused_code.py (2)
apps/unused_code/unused_code.py (3)
  • _git_grep (187-214)
  • get_unused_functions (336-393)
  • process_file (239-315)
tests/utils.py (1)
  • get_cli_runner (4-5)
apps/unused_code/unused_code.py (1)
apps/utils.py (1)
  • all_python_files (76-86)
🔇 Additional comments (11)
tests/unused_code/unused_code_file_for_test.py (3)

10-18: Good addition of explicit return type annotations.

The functions now explicitly specify -> None return type annotations, which improves code clarity and enables better type checking. This is a positive change that aligns with Python typing best practices.


77-100: Documentation pattern coverage looks comprehensive.

The unused_code_check_pods function properly demonstrates real function usage by calling unused_code_get_pod_status(), ensuring it won't be marked as unused. The documentation includes a clear comment about the expected behavior.


216-218: LGTM! Proper annotation of the skip directive.

The function correctly includes the # skip-unused-code comment that will be detected by the unused code analyzer to skip this function from being reported.

apps/unused_code/unused_code.py (2)

22-55: Excellent implementation of portable grep flag detection.

The @lru_cache decorator ensures thread-safe, single execution of the detection logic. The function properly handles both PCRE and basic regex flags, uses /dev/null to avoid capturing large outputs, and provides clear error messages when neither flag is supported.


390-392: Great addition of deterministic output sorting.

Sorting the unused functions before output ensures consistent, reproducible results across CI runs, which helps with debugging and test stability.

tests/unused_code/test_unused_code.py (3)

69-74: Test correctly simulates fixture parameter usage detection.

The mock implementation properly differentiates between call-site patterns (ending with [(]) and parameter usage patterns, returning appropriate responses for each case.


156-183: Comprehensive test for path parsing edge cases.

Excellent test coverage for Windows paths with colons and other edge cases. The test ensures the parsing logic using split(":", 2) correctly handles paths that contain colons in the file path portion.


186-211: Good test for resilience against malformed git grep output.

The test properly verifies that the code gracefully handles and skips malformed git grep output lines while continuing to process valid entries.

tests/unused_code/test_doc_header_false_positives.py (3)

84-90: Good handling of regex engine differences in tests.

The test correctly handles the difference between PCRE and basic regex patterns - it tests PCRE patterns with Python's re module but skips direct pattern testing for -G patterns due to POSIX token incompatibility. This is a pragmatic approach since Python's re doesn't support POSIX character classes like [[:space:]].


92-124: Comprehensive test coverage for documentation pattern detection.

The test thoroughly covers various documentation patterns and legitimate code patterns, ensuring the _is_documentation_pattern function correctly distinguishes between them.


245-248: No changes needed to mock_git_grep

The patterns returned by _build_call_pattern always include the whitespace quantifier immediately before [(] (e.g. \s*[(] or [[:space:]]*[(]), so checking pattern.endswith("[(]") correctly matches both PCRE and basic regex call patterns.

for keyword in doc_keywords:
if keyword in stripped_line:
# If we find doc keywords and function name with parens, likely documentation
if f"{function_name}(" in stripped_line:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the key word is in the line, the function name won't be found in the same line if they are following valid docstring patterns. We typically ask people to put a new line or add the details in next line.

LOGGER.error(str(e))
sys.exit(1)

# res = process_file(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to remove these.

r"""Build a portable regex to find a parameter named `function_name` in a def signature."""
flag = _detect_supported_grep_flag()
if flag == "-P":
return rf"def\s+\w+\s*[(][^)]*\b{function_name}\b"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@myakove this is still valid. I think you need a test with a fixture that is like this

@pytest.fixture(scope="function")
def fixtureA():
    <does something>

def test_temp(
fixtureA
):
    <does something>

I did not see it getting caught.

Also how about fixtures being used via usefixtures()? I don't see (I might be missing something), that being handled either.
I did install your branch locally and tested it and both cases reported failures.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
apps/unused_code/unused_code.py (1)

465-469: Safer split of git grep output to handle rare path colons

Use rsplit(":", 2) to split from the right; this avoids edge cases if a path ever contains a colon (less likely on Windows due to relative paths, but still safer).

-            parts = entry.split(":", 2)
+            parts = entry.rsplit(":", 2)
@@
-                    parts = entry.split(":", 2)
+                    parts = entry.rsplit(":", 2)

Also applies to: 498-505

🧹 Nitpick comments (7)
apps/utils.py (2)

76-89: Prune excluded dirs in-place and normalize directory path for consistent, faster walks

  • Use in-place pruning of dirs to avoid descending into excluded trees; current substring check still traverses them and can be slow on large repos.
  • Normalize provided directory to an absolute path for consistent outputs regardless of CLI input.
  • Consider adding ".venv" to the excludes (fairly common).

Apply:

 def all_python_files(directory: click.Path | None = None) -> Iterable[str]:
     """
     Get all python files from current directory and subdirectories
     """
-    exclude_dirs = [".tox", "venv", ".pytest_cache", "site-packages", ".git"]
-    target = str(directory) if directory else os.path.abspath(os.curdir)
+    exclude_dirs = {".tox", "venv", ".venv", ".pytest_cache", "site-packages", ".git"}
+    target = os.path.abspath(str(directory)) if directory else os.path.abspath(os.curdir)
 
-    for root, _, files in os.walk(target):
-        if [_dir for _dir in exclude_dirs if _dir in root]:
-            continue
+    for root, dirs, files in os.walk(target, topdown=True):
+        # Prevent walking into excluded directories
+        dirs[:] = [d for d in dirs if d not in exclude_dirs]
         for filename in files:
             if filename.endswith(".py"):
                 yield os.path.join(root, filename)

76-83: Docstring should mention the optional directory argument and absolute yielding behavior

The docstring still reflects “current directory” only. Please clarify that passing directory scopes the walk root. If you accept the abspath change above, note that yielded paths are absolute.

tests/unused_code/manifests/functions_as_args.py (1)

5-9: Use typing.Callable instead of builtin callable in annotations

The builtin callable is a function, not a PEP 484 type. Use typing.Callable (optionally with ellipsis/return type). Future annotations won’t save you from static typing here.

-from __future__ import annotations
+from __future__ import annotations
+from typing import Any, Callable
@@
-class TimeoutSampler:
-    def __init__(self, wait_timeout: int, sleep: int, func: callable, **func_kwargs) -> None:
+class TimeoutSampler:
+    def __init__(self, wait_timeout: int, sleep: int, func: Callable[..., Any], **func_kwargs) -> None:
tests/unused_code/test_unused_code_coverage.py (2)

138-149: Enable the documentation-pattern test once the parser bug is fixed

Once _is_documentation_pattern correctly matches indented “name (type): ...” lines (see my comment in apps/unused_code/unused_code.py), this xfail can be removed to assert the behavior.

-@pytest.mark.xfail
 @pytest.mark.parametrize(
     ("line", "is_doc"),
     [
         ("    my_function (str): description", True),
         ("if my_function(): pass", False),
         ("def my_function(): pass", False),
         ("    # my_function()", True),
     ],
 )
 def test_is_documentation_pattern(line, is_doc):
     assert _is_documentation_pattern(line, "my_function") == is_doc

152-159: Assert the precise exception type from _git_grep for clearer intent

_git_grep wraps git errors in RuntimeError. Asserting Exception is broad and could mask regressions.

 with pytest.raises(Exception):
-    _git_grep("pattern")
+    _git_grep("pattern")

Follow-up:

-    with pytest.raises(Exception):
+    with pytest.raises(RuntimeError):
         _git_grep("pattern")
apps/unused_code/unused_code.py (2)

462-486: Reduce false positives: ignore string-literal-only occurrences in general usage scan

Whole-word matching is intended to catch references, but it will also treat occurrences inside string literals as “usage” (e.g., logging). For non-fixture scanning, skip lines where the match is quoted.

         # Ignore commented lines (full line or inline)
         code_part = _line.split("#", 1)[0]
         if func.name not in code_part:
             continue
+        # Ignore occurrences inside string literals like "my_function" or 'my_function'
+        if re.search(rf'(?<!\\)(["\'])\s*{re.escape(func.name)}\s*\1', code_part):
+            continue
 
         # If we are here, it's a valid usage.
         used = True
         break

Note: This does not affect fixture-specific string checks below.


573-580: Mutual exclusivity: make --file-path and --directory conflict explicit

Currently both options can be provided; code silently prefers file_path. Be explicit and fail fast to avoid user confusion.

     if file_path and not os.path.isfile(str(file_path)):
         LOGGER.error("File path must be a file, not a directory.")
         sys.exit(1)
 
     if directory and not os.path.isdir(str(directory)):
         LOGGER.error("Directory must be a directory, not a file.")
         sys.exit(1)
+
+    if file_path and directory:
+        LOGGER.error("Please pass only one of --file-path or --directory, not both.")
+        sys.exit(1)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f96b127 and 01995c0.

📒 Files selected for processing (9)
  • .gitignore (1 hunks)
  • apps/unused_code/README.md (1 hunks)
  • apps/unused_code/unused_code.py (7 hunks)
  • apps/utils.py (1 hunks)
  • tests/unused_code/manifests/functions_as_args.py (1 hunks)
  • tests/unused_code/manifests/test_config.yaml (1 hunks)
  • tests/unused_code/manifests/unused_code_file_for_test.py (1 hunks)
  • tests/unused_code/test_unused_code.py (3 hunks)
  • tests/unused_code/test_unused_code_coverage.py (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • tests/unused_code/manifests/test_config.yaml
  • apps/unused_code/README.md
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/unused_code/test_unused_code.py
  • .gitignore
🧰 Additional context used
🧬 Code graph analysis (2)
tests/unused_code/test_unused_code_coverage.py (2)
apps/unused_code/unused_code.py (15)
  • _check_fixturenames_insert_pattern (149-179)
  • _check_getfixturevalue_pattern (182-218)
  • _find_git_root (345-362)
  • _git_grep (365-404)
  • _is_documentation_pattern (284-342)
  • _is_pytest_mark_usefixtures_call (134-146)
  • _is_usefixtures_context (221-281)
  • _iter_functions (407-416)
  • _resolve_absolute_path (429-434)
  • get_unused_functions (563-635)
  • is_fixture_autouse (59-70)
  • is_ignore_function_list (419-426)
  • is_pytest_fixture (73-98)
  • process_file (437-530)
  • _detect_supported_grep_flag (24-56)
tests/utils.py (1)
  • get_cli_runner (4-5)
apps/unused_code/unused_code.py (1)
apps/utils.py (2)
  • get_util_config (14-18)
  • all_python_files (76-88)
🔇 Additional comments (2)
tests/unused_code/manifests/unused_code_file_for_test.py (1)

1-219: Good coverage of documentation-like patterns and real usages

This manifest thoughtfully exercises many doc patterns and legitimate calls. It should help stabilize false-positive filtering.

apps/unused_code/unused_code.py (1)

101-115: Confirm intent: usage pattern now matches any whole-word reference (not only call sites)

The new _build_usage_pattern switches from call-site to whole-word matching to pick up references (e.g., passing functions as args). This increases recall but may increase false positives (strings, comments). With the string-literal filter above and existing doc filtering, this seems reasonable. Please confirm this trade-off is intended.

@myakove
Copy link
Collaborator Author

myakove commented Aug 22, 2025

@dbasunag Please check again with the latest code

@myakove
Copy link
Collaborator Author

myakove commented Aug 22, 2025

/verified

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pyutils-unusedcode crashes when git grep finds no matches for pytest fixtures
7 participants